Abstract. XSLT is an increasingly popular language for processing
XML data and is widely supported by application platform software.
However, current XSLT processing engines make little optimization
effort: evaluating even a very simple XSLT program on a large XML
document with a simple schema may result in extensive memory usage.
In this paper, we present the Streaming Processing Model (SPM), a
novel model for evaluating a subset of XSLT programs on XML
documents, especially large ones. With SPM, an XSLT processor can
transform an XML source document into other formats without
requiring extra memory buffers. Our approach can therefore not only
handle large source documents but also produce large results. A
performance study demonstrates the advantages of the SPM approach:
experimental results confirm that SPM is typically 2 to 10 times
faster than existing approaches. Moreover, the SPM approach is
highly scalable.

1 Introduction 

As XML has continued to gain popularity as a standard for information
representation and exchange, tools to process XML are increasingly supported
by common application platforms. Many of them implement XSLT [12]. As a
transformation language, XSLT has proven to be very popular with developers
and is often implemented as a stand-alone tool. Unlike XQuery [11], XSLT was
not designed as a full-featured query language. Nevertheless, XSLT can easily
be used for query-like transformations.

When processing XML documents, most prevalent XSLT processors try to
keep the entire DOM or DOM-like data structure in main memory. The
in-memory size of the DOM can be an order of magnitude larger than that of
the original XML file. Therefore, when the size of an XML file is comparable
to or larger than the main memory, XSLT processors thrash due to excessive
use of virtual memory, and their efficiency degrades drastically. These XSLT
processing engines make little optimization effort. As observed in our
experiments, even for a rather simple XSLT program, evaluation becomes
extremely slow once the source XML document is larger than the main memory.

* This work is supported in part by the National Hi-Tech Research and Development
Program of China under Grant No. 2002AA116020 and by the National Natural
Science Foundation of China under Grant No. 60228006.



In this paper, we present the Streaming Processing Model (SPM), a novel
model for evaluating a subset of XSLT programs on XML documents, especially
large ones. In the SPM model, given a DTD D, an XSLT program L is converted
into a set of handlers for SAX-like events. When an XML file is read in,
events are fired. At first, the result document is empty. As the scan
advances, the output fragments generated by the event handlers are appended
to the result document continually. Users obtain the result as soon as the
scanning process finishes. With our SPM model, an XSLT engine can transform
an XML source document into other formats without requiring extra memory
buffers. Our approach can therefore not only handle large source documents
but also produce large results.

Contributions. In this paper, we make several contributions. First, we
propose the SPM model to evaluate a subset of XSLT programs on XML data.
With this model, XSLT can be evaluated without extra buffers. The content of
the result document can be delivered continuously before all the source data
has been processed.

Second, we introduce transformation trees, which incorporate the schema
information of the source document and the XSLT transformation. We also
devise algorithms to build a transformation tree and then convert it into an
SPM model.

Third, with a performance study, we demonstrate the advantages of the
SPM approach. Experimental results confirm that it is typically 2 to 10
times faster than current approaches for XSLT evaluation, and that it is
highly scalable.

2 From DTD and XSLT to Transformation Tree 

In this section, we first define simple DTDs and XSLT core, and then
introduce transformation trees.

DTDs are well known. For simplicity, we consider a set of simple DTDs in
this paper. A simple DTD D contains no IDs or IDREFs, and is acyclic. Further,
each element type has only one parent element type. That is to say, in the DTD
D, if the element type C is a child of the element type A, then it cannot be a
child of the element type B at the same time. Given a DTD, the last condition
can easily be satisfied by renaming some element names. These assumptions
guarantee that D can be represented as a DTD tree.
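The two structural conditions above (a single parent per element type, and acyclicity) can be checked mechanically. The following Python sketch is only an illustration, not part of the original work: the dictionary encoding of the DTD and the name `is_simple` are assumptions introduced here, and the ID/IDREF condition is not modelled.

```python
from typing import Dict, List

# Hypothetical encoding: element type -> list of child element types.
# Occurrence markers ('*') and #PCDATA are omitted because they do not
# affect the tree-shape conditions checked here.
BOOKS_DTD: Dict[str, List[str]] = {
    "publication": ["book"],
    "book": ["title", "isbn", "author"],
    "title": [],
    "isbn": [],
    "author": ["name"],
    "name": [],
}


def is_simple(dtd: Dict[str, List[str]], root: str) -> bool:
    """Check the single-parent and acyclicity conditions of a simple DTD."""
    parent: Dict[str, str] = {}
    for elem, children in dtd.items():
        for child in children:
            if child in parent and parent[child] != elem:
                return False          # two distinct parent element types
            parent[child] = elem
    if root in parent:
        return False                  # the root must not be anyone's child
    # With at most one parent per type, walking upwards from every type
    # must reach the root; otherwise the type is on a cycle or unreachable.
    for elem in dtd:
        seen = set()
        while elem != root:
            if elem in seen or elem not in parent:
                return False
            seen.add(elem)
            elem = parent[elem]
    return True
```

Under this encoding, the DTD of the running example passes the check, whereas a DTD in which an element type is declared as a child of two different element types is rejected until one occurrence is renamed.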

XSLT is a language for transforming XML documents into other formats. In
this paper, we do not consider the complete XSLT language. We define a
restricted subset of XSLT that we term XSLT core, which is described as follows.

XSLT core. An XSLT core program L is a set of template rules r_i, each of
which is a pair (pattern(r_i), template(r_i)), where pattern(r_i) is the match
pattern of r_i, and template(r_i) is the output template of r_i, which forms
part of the result. The output template is explained below.

Output Template. An output template is a sequence (o_1, o_2, ..., o_n) of two
kinds of constructs: constant strings and apply-templates. That is to say,
o_i is either a constant string or an apply-template. We also assume that
adjacent constant strings have been merged into one longer string, so that
for each i, o_i and o_{i+1} cannot both be constant strings.
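As an illustration of this definition, a template rule can be represented as a pattern plus a list of constructs, with a normalization step that merges adjacent constant strings. This is a minimal Python sketch; the names `ApplyTemplates`, `TemplateRule` and `merge_constants` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ApplyTemplates:
    select: str                  # select path of an apply-templates construct


Construct = Union[str, ApplyTemplates]   # constant string or apply-templates


@dataclass
class TemplateRule:
    pattern: str                 # match pattern of the rule
    template: List[Construct]    # output template of the rule


def merge_constants(seq: List[Construct]) -> List[Construct]:
    """Merge runs of adjacent constant strings into one longer string."""
    out: List[Construct] = []
    for item in seq:
        if isinstance(item, str) and out and isinstance(out[-1], str):
            out[-1] = out[-1] + item
        else:
            out.append(item)
    return out


# The "book" rule of Fig. 1, deliberately written with split constants.
book_rule = TemplateRule(
    pattern="book",
    template=merge_constants([
        "<tr>", "<td>", ApplyTemplates("title"), "</td>", "<td><table>",
        ApplyTemplates("author"), "</table></td>", "</tr>",
    ]),
)
```

After normalization, the template of `book_rule` alternates between constants and apply-templates, so no two neighbouring constructs are both constant strings.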

Combined with the definition of simple DTDs, our definition of XSLT core
further guarantees that each template rule or apply-template can only match
or select a single node or a set of sibling nodes in the source document. As
stated above, simple DTDs can be represented as DTD trees. In a DTD tree,
each node represents an element type, while each edge indicates the
parent/child relationship between two element types. Given a simple DTD D
and an XSLT core program L, L is evaluated on any instance X of D. Each
template rule in L is matched against a single node in the DTD tree of D,
and each apply-template in a template rule selects a single node of the DTD
tree.



<xsl:template match="/">
  <html><head><title>Books Information</title></head>
  <body><table>
    <xsl:apply-templates select="publication/book"/>
  </table></body>
  </html>
</xsl:template>

<xsl:template match="book">
  <tr><td><xsl:apply-templates select="title"/></td>
  <td><table>
    <xsl:apply-templates select="author"/>
  </table></td>
  </tr>
</xsl:template>

<xsl:template match="author">
  <tr><td><xsl:apply-templates select="name"/></td></tr>
</xsl:template>

<xsl:template match="title">
  <xsl:value-of select="."/>
</xsl:template>

<xsl:template match="name">
  <xsl:value-of select="."/>
</xsl:template>



Fig. 1. An example of XSLT. 



<?xml version="1.0" encoding="UTF-8"?>
<publication>
  <book><title>A Complete Guide to DB2 Universal Database</title>
    <isbn>1-55860-482-0</isbn>
    <author><name>Don Chamberlin</name></author>
  </book>
</publication>



Fig. 2. An XML document describing books information. 



Fig. 1 presents an XSLT core example. Figs. 2 and 3 show an XML document
and its DTD, respectively, while the DTD tree is illustrated in Fig. 4. They
are used throughout this paper as a running example. In order to integrate
DTDs with XSLT core programs, we introduce a new data structure, called a
transformation tree, which is an extended DTD tree. An example of a
transformation tree


<!ELEMENT publication (book*)>
<!ELEMENT book (title, isbn, author*)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT isbn (#PCDATA)>
<!ELEMENT author (name)>
<!ELEMENT name (#PCDATA)>



Fig. 3. The DTD of the XML document in Fig. 2. 




Fig. 4. A DTD tree. Fig. 5. A transformation tree.



is depicted in Fig. 5, which combines the information from the DTD in Fig. 3
and the XSLT in Fig. 1. In a transformation tree, a circle indicates an
element type, a rectangle indicates a constant string from the XSLT program,
and a dashed line represents an invocation of an apply-template.

Given the schema of the XML source document, denoted by its DTD in 
this paper, and an XSLT core program, we devise algorithms to generate the 
corresponding transformation tree, which takes both the schema information 
and the XSLT core into consideration. Next we define transformation tree, and 
present some notations. 

Transformation Tree. Given a simple DTD D and an XSLT core program
L, the corresponding transformation tree T is a rooted ordered tree. It consists
of two kinds of nodes: element nodes and constant string nodes. Element nodes 
are indicated by circles, among which there is a distinguished root node, while 
rectangles indicate constant string nodes. There are three kinds of edges in T. 
The first kind indicates the parent/child relationship between two element types, 
and is illustrated by solid lines. As illustrated by dot-dashed lines, the second 
kind of edges connects element nodes and constant string nodes. The third kind 
represents the calling/being-called relationship, and is illustrated by dashed lines 
in the diagram. The first kind of edges can be derived from the DTD, while the 
XSLT program introduces the latter two. 

An XSLT program L is a set of template rules r_i, each of which consists of
two parts: the match pattern pattern(r_i) and the output template
template(r_i). Under the restrictions imposed on XSLT core, pattern(r_i) is
matched against a single node n_i in the DTD tree. We call n_i the matched
node of r_i, denoted by mnode(r_i). As discussed above, template(r_i) is a
sequence (o_i1, o_i2, ..., o_it) of constant strings and apply-templates. If
o_ij (1 ≤ j ≤ t) is an apply-template, then it also selects a single node
n_ij in the DTD tree, denoted by selnode(o_ij). We call n_ij the selected
node of the apply-template.
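Because every element type in a simple DTD has a single parent, an element name identifies exactly one node of the DTD tree, which makes matched and selected nodes easy to compute. The Python sketch below illustrates this for the running example. The dictionary encoding and the function bodies are assumptions made here; in particular, attaching the "/" pattern to the root element is inferred from the example SPM model given later, not stated as a rule in the paper.

```python
from typing import Dict, List

# Hypothetical encoding of the DTD tree of the running example (Fig. 4).
CHILDREN: Dict[str, List[str]] = {
    "publication": ["book"],
    "book": ["title", "isbn", "author"],
    "title": [],
    "isbn": [],
    "author": ["name"],
    "name": [],
}
ROOT = "publication"


def mnode(pattern: str) -> str:
    """Matched node of a rule whose pattern is '/' or a path of names."""
    if pattern == "/":
        return ROOT                      # assumption: '/' attaches to the root
    name = pattern.split("/")[-1]        # the last step names the node
    if name not in CHILDREN:
        raise ValueError(f"unknown element type {name!r}")
    return name


def selnode(pattern: str, select: str) -> str:
    """Selected node of an apply-templates inside the rule for `pattern`."""
    steps = select.split("/")
    if pattern == "/":
        # Seen from the document node, the first step must be the root.
        if steps[0] != ROOT:
            raise ValueError(f"{steps[0]!r} is not the root element")
        node, steps = ROOT, steps[1:]
    else:
        node = mnode(pattern)
    for step in steps:                   # descend one child per step
        if step not in CHILDREN[node]:
            raise ValueError(f"{step!r} is not a child of {node!r}")
        node = step
    return node
```

For the rules of Fig. 1, `selnode("/", "publication/book")` yields `book` and `selnode("author", "name")` yields `name`, while a select path that does not follow the DTD tree raises an error.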






3 Streaming Processing of XSLT 



In this section, we describe the streaming processing model of XSLT evaluation 
adopted in our work, and present the algorithms for converting a transformation 
tree to a streaming processing model. 

Determining whether an XSLT program can be processed in a streaming
fashion is not easy, and the decision involves many subtle issues. Until we
give the strict definition of a streamable XSLT program, we informally say
that an XSLT program is streamable if it can be evaluated without extra
buffers. To gain a clearer understanding of this problem, we first give
several positive and negative cases, shown in Fig. 6, and discuss them.




(a) (b) (c) (d) 

Fig. 6. Streamable vs. unstreamable XSLTs. 



Let D_i and L_i denote the DTD and the XSLT program of the transformation
tree T_i, for 1 ≤ i ≤ 4. D_1 is the same as D_2, as shown below.



<!ELEMENT A (B, C)>
<!ELEMENT B (#PCDATA)>
<!ELEMENT C (#PCDATA)>



The root node of type A has two children, one of type B and the other of
type C. L_1 is rather simple. It has only one non-trivial template rule r,
whose matched node is the root node. Aside from constant strings (footnote
1), r's output template involves two apply-templates. The first
apply-template selects the child node of type B, while the second selects
the child node of type C. Comparing Fig. 6(b) with Fig. 6(a), we see that
the order of the two apply-templates is reversed. When transforming an XML
document conforming to D_1, L_1 can be evaluated without buffers, while L_2
cannot: it needs extra buffers. The reason is that the order (B, C) in the
XML stream is opposite to the order (C, B) in the result tree of L_2. If no
buffer can be used, when the B node arrives, L_1 outputs something, while
L_2 does nothing; when the C node arrives, both L_1 and L_2 output the
PCDATA value of this node. After that, L_1 is done. However, L_2 still has
to access the node B,


1 They are not shown in Fig. 6. 



which has flowed away long before. Therefore, L_2 is not streamable with
respect to D_2.

L_3 and L_4, shown in Fig. 6(c) and 6(d), are a bit more complicated than
the former pair. In fact, L_3 is the same as L_4, but they are applied to
XML documents conforming to different DTDs. There is only a minor difference
between D_3 and D_4. In D_3, A has a child B, which appears exactly once; in
D_4, the edge from A to B is labelled by a '*', which indicates that in the
source document a node of type A can have zero or more children of type B.
In the case of Fig. 6(c), part of the result document would be something like
"<C>string_1</C> <D>string_2</D>", which follows the internal order of
the source document. However, in the case of Fig. 6(d), part of the result
would be
"<C>string_1</C> ... <C>string_2</C> <D>string_3</D> ... <D>string_4</D>".
The order in which these nodes appear in the source document is
"<C/> <D/> ... <C/> <D/>", i.e., C nodes and D nodes occur alternately. We
cannot obtain the sequence (C, C, ..., D, D) from the sequence
(C, D, C, D, ..., C, D) if only a single pass over the original sequence is
allowed and no memory buffers can be used during the transformation.
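The four cases can be replayed with a small simulation. The Python sketch below is an illustration only, not the streamability test (which the paper does not present): it compares, for one given document, the output order required by the order of the apply-templates with the order in which a bufferless single-pass processor would encounter the same values. The function names and sample values are assumptions made here.

```python
from typing import List, Tuple

Leaf = Tuple[str, str]   # (element type, PCDATA value), in document order


def reference_output(selects: List[str], leaves: List[Leaf]) -> List[str]:
    """Order required by the template: one apply-templates after another."""
    return [text for sel in selects for (typ, text) in leaves if typ == sel]


def single_pass_output(selects: List[str], leaves: List[Leaf]) -> List[str]:
    """Order available without buffers: values as they arrive in the stream."""
    wanted = set(selects)
    return [text for (typ, text) in leaves if typ in wanted]


def agrees(selects: List[str], leaves: List[Leaf]) -> bool:
    """True if the bufferless order equals the required order."""
    return reference_output(selects, leaves) == single_pass_output(selects, leaves)


doc_ab = [("B", "b"), ("C", "c")]                       # Fig. 6(a), 6(b)
doc_c = [("C", "s1"), ("D", "s2")]                      # Fig. 6(c): one B
doc_d = [("C", "s1"), ("D", "s2"), ("C", "s3"), ("D", "s4")]  # Fig. 6(d): two Bs

print(agrees(["B", "C"], doc_ab))   # case (a)
print(agrees(["C", "B"], doc_ab))   # case (b)
print(agrees(["C", "D"], doc_c))    # case (c)
print(agrees(["C", "D"], doc_d))    # case (d)
```

The orders agree in cases (a) and (c) and disagree in cases (b) and (d). Agreement on a single document is only a necessary condition: a program counts as streamable only if the orders agree for every instance of the DTD.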

Next we present the definition of a streamable XSLT. 

Streamable XSLT. Given a simple DTD D, an XSLT core program L is said
to be streamable if, for any XML instance X of D, L can always be evaluated
successfully on X to produce the correct result document with no memory
buffers required.

We now introduce our streaming processing model. When an XML document is
scanned from its beginning to its end, a series of events is emitted. These
events can be classified into two categories: element-start and element-end
events. For simplicity, we do not consider attributes of elements in this
work, or we assume that they have been replaced by sub-elements and the DTD
has been modified correspondingly. Next we describe the streaming processing
model, abbreviated as SPM.

Streaming Processing Model (SPM). For each event e, whether element-start
or element-end, there is an output fragment attached to e.

To each element-start event, a constant string cstr_s, which may be empty,
is attached. For a non-leaf node, a constant string cstr_e is also attached
to its element-end event. These strings constitute the output fragments of
the respective events.

In contrast, a triple (cstr_e1, pcdata, cstr_e2) is attached to the
element-end event of each leaf node, where the pcdata field is either null
or the PCDATA value of the leaf node. In this case, the concatenation of
cstr_e1, pcdata and cstr_e2 forms the output fragment of the corresponding
event.

When an XSLT program L is evaluated on a source document, a sequence of
events is fired. The concatenation of their output fragments can be proven
to be the result document. As the scan advances, output fragments are
appended to the result, so the result can be delivered continuously. When
the scanning process finishes, the result document is complete.
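The evaluation step described above amounts to a table lookup per event followed by concatenation. The following Python sketch shows this on a hand-written event list. The model, its constant strings and the event encoding are hypothetical (the constant strings of Fig. 6 are not given in the paper), and `PCDATA` is a placeholder introduced here for the pcdata field of a leaf's triple.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

PCDATA = object()   # placeholder for the PCDATA value of a leaf node

Event = Tuple[str, str, str]   # (kind, element type, PCDATA value or "")

# Hypothetical SPM for a program on the DTD A(B, C); events that are
# missing from the table carry an empty output fragment.
MODEL: Dict[Tuple[str, str], List[object]] = {
    ("start", "A"): ["<r>"],
    ("end", "B"): ["<b>", PCDATA, "</b>"],
    ("end", "C"): ["<c>", PCDATA, "</c>"],
    ("end", "A"): ["</r>"],
}


def evaluate(model: Dict[Tuple[str, str], List[object]],
             events: Iterable[Event]) -> Iterator[str]:
    """Yield the output fragment of each event, in arrival order."""
    for kind, name, text in events:
        parts = model.get((kind, name), [])
        yield "".join(text if p is PCDATA else p for p in parts)


EVENTS: List[Event] = [
    ("start", "A", ""),
    ("start", "B", ""), ("end", "B", "x"),
    ("start", "C", ""), ("end", "C", "y"),
    ("end", "A", ""),
]

result = "".join(evaluate(MODEL, EVENTS))
print(result)
```

Because `evaluate` is a generator, each fragment is available as soon as its event has been seen, which mirrors the continuous delivery of the result.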
Fig. 7. Some concrete cases of the BuildSPM algorithm.



As can be seen from Fig. 6, given a DTD D, an XSLT core program L may or
may not be streamable w.r.t. D. In other words, it may or may not be
convertible to a streaming processing model. Thus, we have to consider which
XSLT programs can be converted and how they can be converted.

We devise the Convert procedure which tries to convert an XSLT pro- 
gram into a streaming processing model. If it reports a failure, this means that 
the XSLT program cannot be successfully converted to a streaming processing 
model. In this paper, we will not separately present the algorithm for testing the 
streamability of an XSLT program, and we deem that checking by theoretical 
proofs is feasible. In fact, that is a major part of our future work. 

The parameters of the algorithm Convert are a DTD D and an XSLT
program L, and it returns a streaming processing model M. It first builds a
transformation tree T from D and L, and then builds the corresponding SPM
model M through the procedure BuildSPM, which is given in Alg. 1. Before
calling the procedure BuildSPM, it initializes the output fragments of the
element-start and element-end events of each node n in the transformation
tree to empty strings.

The input parameter of BuildSPM is a node n of the transformation tree.
When BuildSPM is first called by the procedure Convert, rather than
recursively by itself, the root node of the transformation tree T is passed
as n. During the entire recursive process, M is a global variable. Let
start(n) and end(n) denote the output fragments of the element-start and
element-end events of the node n, respectively. start(n) may be modified by
BuildSPM(n_0) or BuildSPM(n), where n_0 is the parent node of n. In
BuildSPM(n_0), something may be prepended to start(n), while in BuildSPM(n),
more may be appended to it. In the pseudo-code, the operator '+' denotes
concatenation of strings, e.g. end(c) ← end(c) + d_{i+1}, or addition of
integer variables. The analysis of end(n) is similar, so we do not detail it
further. We do not explain the algorithm BuildSPM line by line. Instead, we
present several illustrations to facilitate understanding. Concrete cases
related to lines 5-7, 11-12, 13, 17-18 and 19-20 of Alg. 1 are shown in
Fig. 7(a), 7(b), 7(c), 7(d) and 7(e), respectively.

As a complete example, the transformation tree in Fig. 5 will be converted 
to the following SPM model. 



start(publication) = "<html>...<table>"



Algorithm 1 BuildSPM

Input: node n
 1: if (n is a leaf node) then
 2:     end(n) ← pcdata + end(n); return
 3: {c_j (1 ≤ j ≤ p) denote the children of n derived from the DTD.}
 4: {d_i (1 ≤ i ≤ q) denote the children of n introduced by the XSLT.}
 5: if d_1 is a constant string node
 6:     start(n) ← start(n) + d_1; i ← 2
 7: else i ← 1
 8: while i < q
 9:     c ← the child of n along the path from n to d_i
10:     if the edge (n, c) is not labelled by '*'
11:         if d_{i+1} is a constant string node
12:             end(c) ← end(c) + d_{i+1}; i ← i + 2
13:         else i ← i + 1
14:     else
15:         if d_{i+1} is a constant string node
16:             c' ← the child of n along the path from n to d_{i+2}
17:             if (n, c') is labelled by '*'
18:                 Err("not streamable!")
19:             else
20:                 start(c') ← d_{i+1} + start(c'); i ← i + 2
21:         else i ← i + 1
22: for each d_i of element node type
23:     BuildSPM(d_i)



end(publication) = "</table>...</html>"
start(book) = "<tr><td>"
end(book) = "</table></td></tr>"
end(title) = concat(PCDATA value of title, "</td><td><table>")
start(author) = "<tr><td>"
end(name) = concat(PCDATA value of name, "</td></tr>")

Note that trivial output fragments are not shown here, such as start(title)="", 
etc. 
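To make the conversion concrete, the following Python sketch follows the recoverable steps of Alg. 1 and applies them to the transformation tree of the running example. It is not the authors' implementation. Two points are assumptions made here: the `"{PCDATA}"` token stands in for the pcdata field, and a constant string that follows the last selected node under a '*' edge is appended to end(n), a case the printed pseudo-code does not spell out but which reproduces the example model above. The class and function names are hypothetical.

```python
PCDATA = "{PCDATA}"   # placeholder token for a leaf's PCDATA value (assumption)


class Node:
    """Element node of a transformation tree."""

    def __init__(self, name, parent=None, star=False):
        self.name, self.parent = name, parent
        self.star = star           # True if the edge (parent, self) has a '*'
        self.dtd_children = []     # c_1..c_p, derived from the DTD
        self.xslt_children = []    # d_1..d_q: str constants or selected Nodes
        self.start = self.end = "" # output fragments, initially empty
        if parent is not None:
            parent.dtd_children.append(self)


def child_toward(n, d):
    """The child of n on the path from n down to d."""
    while d.parent is not n:
        d = d.parent
    return d


def build_spm(n):
    if not n.dtd_children:                       # lines 1-2: leaf node
        n.end = PCDATA + n.end
        return
    d, i = n.xslt_children, 0
    if d and isinstance(d[0], str):              # lines 5-7
        n.start += d[0]
        i = 1
    while i < len(d):
        c = child_toward(n, d[i])                # line 9
        nxt = d[i + 1] if i + 1 < len(d) else None
        if not isinstance(nxt, str):             # lines 13, 21: no constant
            i += 1
            continue
        if not c.star:                           # lines 11-12
            c.end += nxt
        elif i + 2 < len(d):                     # lines 16-20
            c2 = child_toward(n, d[i + 2])
            if c2.star:
                raise ValueError("not streamable!")
            c2.start = nxt + c2.start
        else:                                    # ASSUMPTION: trailing constant
            n.end += nxt
        i += 2
    for x in d:                                  # lines 22-23
        if isinstance(x, Node):
            build_spm(x)


# Transformation tree of the running example (Fig. 5), built by hand.
pub = Node("publication")
book = Node("book", pub, star=True)
title = Node("title", book)
isbn = Node("isbn", book)
author = Node("author", book, star=True)
name = Node("name", author)
pub.xslt_children = ["<html>...<table>", book, "</table>...</html>"]
book.xslt_children = ["<tr><td>", title, "</td><td><table>", author,
                      "</table></td></tr>"]
author.xslt_children = ["<tr><td>", name, "</td></tr>"]
build_spm(pub)
```

After the call, the `start` and `end` fields of the six nodes hold the fragments listed above, with `{PCDATA}` marking where the text of `title` and `name` is inserted. Like the printed pseudo-code, the sketch only detects the failure case of lines 17-18 and is not a complete streamability test.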

As the above SPM model shows, an output fragment is attached to each
element-start and element-end event. During the scanning of the source
document, many output fragments are generated. For streamable XSLT
programs, no extra memory buffer is required when evaluating them on source
documents. The head of the result can be delivered before all the source
data has arrived. Even a large XML data instance can be processed in a
single pass.
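The example model can be driven by an ordinary SAX parser. The Python sketch below uses the standard `xml.sax` module and writes each output fragment as soon as its event fires. It is an illustration under stated assumptions, not the implementation evaluated in Section 4: the table encoding, the `PCDATA` placeholder and the class name `SPMHandler` are introduced here, the text of the currently open leaf is held until its element-end event, and PCDATA is written without escaping.

```python
import io
import xml.sax

PCDATA = object()   # placeholder for the PCDATA value of a leaf node

# The SPM model of the running example; missing events have empty fragments.
SPM = {
    ("start", "publication"): ["<html>...<table>"],
    ("end", "publication"): ["</table>...</html>"],
    ("start", "book"): ["<tr><td>"],
    ("end", "book"): ["</table></td></tr>"],
    ("end", "title"): [PCDATA, "</td><td><table>"],
    ("start", "author"): ["<tr><td>"],
    ("end", "name"): [PCDATA, "</td></tr>"],
}


class SPMHandler(xml.sax.ContentHandler):
    """Writes the output fragment attached to each SAX event."""

    def __init__(self, model, out):
        super().__init__()
        self.model, self.out = model, out
        self.text = []               # character data since the last tag

    def startElement(self, name, attrs):
        self.text = []
        self._emit(("start", name), "")

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        self._emit(("end", name), "".join(self.text).strip())
        self.text = []

    def _emit(self, key, pcdata):
        for part in self.model.get(key, []):
            self.out.write(pcdata if part is PCDATA else part)


SOURCE = b"""<?xml version="1.0" encoding="UTF-8"?>
<publication>
  <book><title>A Complete Guide to DB2 Universal Database</title>
    <isbn>1-55860-482-0</isbn>
    <author><name>Don Chamberlin</name></author>
  </book>
</publication>"""

out = io.StringIO()
xml.sax.parseString(SOURCE, SPMHandler(SPM, out))
result = out.getvalue()
print(result)
```

Replacing the `io.StringIO` object with a file or socket lets the head of the result leave the process while the tail of the source is still being parsed.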

4 Experimental Results 

In this section, we report the experimental results. We examined the
performance of our SPM approach on XML data of different sizes, and compared
it to several publicly available XSLT processing engines. We evaluated an
XSLT transformation L_0 on the DBLP XML documents. Below we present the
results for L_0, which transforms a DBLP XML document into another schema:
for each conference paper, an HTML table row is generated, listing all
authors of the paper in a nested table, followed by the title of the paper.
L_0 is much like a query: it extracts part of the data and tags it
differently from the source document. Due to space constraints, L_0 is not
shown here.

Our experiments were carried out on a PC with an Intel Pentium 4 1.8 GHz
processor and 256 MB of RAM, running MS Windows 2000 Server. The MS
command-line XSLT tool v1.1 was used. Sun JDK v1.4 was used as the Java
runtime environment. Xalan-j v2.5.1 and Saxon v6.5.3 were also used in the
experiments. Our algorithms related to the SPM model are all implemented in
Java on JDK 1.4. We used XML source documents of varying sizes from the
well-known DBLP database.




(a) (b) 
Fig. 8. Evaluation time vs. size of DBLP source documents. 



We conducted two groups of experiments. The first group is a comparative
study, in which we compare the performance of Xalan-j, Saxon, MS XML and SPM
on DBLP XML documents whose sizes range from 1 MB to 10 MB. The evaluation
time includes both the time to parse the XSLT programs (and DTDs) and the
time to carry out the transformation. As can be seen from Fig. 8(a), for
source documents of this size, the MS XSLT tool performs best, followed by
our SPM approach, then Saxon and Xalan-j. As the document size increases,
the gap between the MS XSLT tool and SPM narrows. The SPM approach is
expected to outperform the MS XSLT tool if the input document is larger
than 10 MB. This is confirmed by the other group of experiments.

The second group is a scalability test, in which we investigate the
scalability of the SPM approach in comparison with MS XML, Saxon and Xalan-j
using large XML documents, whose sizes range from 10 MB to 100 MB. Fig. 8(b)
depicts the corresponding results. When the XML document is larger than
20 MB, the Saxon processor reports an out-of-memory exception; so does the
Xalan-j processor when the document is larger than 30 MB. Although the MS
XSLT tool does not throw exceptions, it makes extensive use of virtual
memory, sometimes exceeding 580 MB. In one of 10 repeated runs of the same
configuration, it reported insufficient virtual memory. With the SPM
approach, however, the evaluation time grows approximately linearly with the
document size. Fig. 8(b) clearly shows that SPM evaluates XSLT 2 to 10 times
faster than MS XML. The high scalability of our SPM approach is therefore
confirmed.

5 Related Work and Conclusion 

The problem of incorporating XSLT processing into database engines is
investigated in [9]. The approach in [7] generates a single SQL query for an
XSLT program and works for a large fragment of XSLT. [8] studies how to
compose an XSLT transformation with an XML view. A major distinction between
our work and [9, 7, 8] is that their work is based on databases, while we
focus on processing XML documents directly.

Much of the previous work is devoted to evaluating XPath queries over
streaming XML [10, 3, 2, 5, 4, 1, 6]. However, to the best of our knowledge,
our work is the first effort to study streaming processing in the context of
XSLT evaluation.

Our approach supports scalable XSLT evaluation. The SPM model offers 
several novel features that make it especially attractive for transforming large 
XML data. Our experimental results have clearly demonstrated the benefits of 
our approach. They show that the SPM approach outperforms current XSLT 
processors by an order of magnitude when dealing with large XML documents. 